ABSTRACT


A meta-analysis was performed to examine the generalizability of the predictive validity of the Air Force
Officer Qualifying Test (AFOQT) operational composites against technical training performance for 14
non-rated Air Force Specialties (AFSs). AFOQT data were from Form Q and were used to compute composites
based on Form S specifications. All five operational composites (Verbal, Quantitative, Academic Aptitude,
Pilot, and Navigator/Technical) were included in the analyses. The criterion was technical training final
grade. Analyses began by examination of the observed correlation between the AFOQT composites and
technical training final grade for each officer training course. The meta-analysis of the observed
correlations was corrected only for sampling error. The observed correlations then were corrected for
range restriction using the multivariate method (Lawley, 1943) and the meta-analysis was repeated. The
range-restriction corrected correlations were then corrected for unreliability (Hunter & Schmidt, 2004) of
the test scores and training criterion and the meta-analysis was repeated. This third set of correlations
provides a theoretical estimate of the predictiveness of the composites when perfectly reliable measures
are available. Ninety percent (63 out of 70) of the observed correlations between the AFOQT composites
and average officer technical training grades were statistically significant at or beyond the .05 level, thus
supporting the AFOQT's general value in selection of non-rated officers. Results of the meta-analyses of the
observed correlations indicated that the predictive validity for four of the five AFOQT composites (Verbal,
Academic Aptitude, Pilot, and Navigator/Technical) was not the same across the non-rated officer training
specialties in the current study. Validity generalization was observed for only the Quantitative composite.
After correction for range restriction, four of the five composites demonstrated validity generalization
across the training specialties. This means that the true validity of the AFOQT composites (with the
exception of Verbal) was consistent across the officer training specialties. The mean validity coefficients
for the Quantitative (.3499), Academic Aptitude (.3878), Pilot (.3525), and Navigator/Technical (.3796)
composites are the best estimates of the average validity across all officer specialties. Additional efforts to
examine the generalizability of the validity of AFOQT composites should expand to include a broader range
of occupational specialties. Expanding the breadth of training specialties would allow the potential
moderating effect of occupational similarity to be examined.
GLOSSARY


AF/A1PF   Air Force, Force Management Policy Division
AFOQT     Air Force Officer Qualifying Test
AFPC      Air Force Personnel Center
AI        Aviation Information
AR        Arithmetic Reasoning
BC        Block Counting
GS        General Science
HF        Hidden Figures
HRRD      Human Resources Research Data Bank
IC        Instrument Comprehension
MK        Math Knowledge
OTS       Officer Training School
RB        Rotated Blocks
ROTC      Reserve Officer Training Corps
TR        Table Reading
VA        Verbal Analogies
WK        Word Knowledge

PREFACE 


This  report  describes  activities  performed  in  support  of  USAF  personnel  selection  and 
classification  (AF/A1PF),  Work  Unit  2313HC58.  The  author  thanks  Mr.  Ken  Schwartz 
(AETC/DPSF)  and  the  Air  Force  Personnel  Center  (AFPC)  Human  Resources  Research  Data 
Bank  (HRRD)  for  support  in  the  development  of  the  database  used  in  this  study. 
1.0 INTRODUCTION


The Air Force Officer Qualifying Test (AFOQT; Carretta & Ree, 1996) is used to award
U.S.  Air  Force  Reserve  Officer  Training  Corps  (ROTC)  scholarships  and  to  qualify  applicants 
for  officer  commissioning  through  the  ROTC  and  Officer  Training  School  (OTS)  programs.  The 
AFOQT  also  is  used  to  qualify  applicants  for  aircrew  training  as  pilots,  combat  system  operators 
(formerly  navigators),  and  air  battle  managers.  The  AFOQT  has  been  validated  against  officer 
training  performance  (Roberts  &  Skinner,  1996),  several  aircrew  training  performance  criteria 
including  passing/failing  training,  training  grades,  and  class  rank  (Carretta,  in  press;  Carretta  & 
Ree,  1995a,  2003;  Olea  &  Ree,  1994),  and  several  non-rated  officer  jobs  (Arth,  1986;  Arth  & 
Skinner,  1986;  Finegold  &  Rogers,  1985;  Hartke  &  Short,  1988). 

The  current  form  of  the  AFOQT  (Form  S)  was  operationally  implemented  in  2005  and 
consists of 11 cognitive subtests and an experimental personality inventory. For operational use,
the  cognitive  subtests  are  combined  into  five  overlapping  composites  as  shown  in  Table  1.  The 
Verbal,  Quantitative,  and  Academic  Aptitude  composites  are  used  to  qualify  applicants  for 
ROTC  and  OTS  officer  commissioning  programs.  The  Pilot  and  Navigator/Technical  composites 
are  used  to  qualify  applicants  for  aircrew  training.  Air  Force  Instruction  36-2013  (United  States 
Air  Force,  2006)  provides  AFOQT  minimum  qualifying  score  requirements  for  officer 
commissioning  and  aircrew  training.  The  minimum  qualifying  scores  for  officer  commissioning 
are  at  least  the  15th  percentile  on  the  Verbal  composite  and  at  least  the  10th  percentile  on  the 
Quantitative composite. ROTC and OTS aircrew training applicants must first qualify for officer
commissioning by meeting the minimum requirements for the AFOQT Verbal and Quantitative
composites. In addition, they must meet minimum qualifying scores for the Pilot and
Navigator/Technical composites. The minimum qualifying scores for aircrew training vary by
program, by commissioning source, and, for pilot training, by whether the applicant has a private pilot's
certificate.  For  many  non-rated  officer  training  specialties1,  no  additional  AFOQT  requirements 
exist  other  than  the  minimum  requirements  for  officer  commissioning. 


1  Applicants  for  some  non-rated  officer  training  specialties  (e.g.,  medical  doctors,  dentists,  legal)  do  not  require 
qualification  on  the  basis  of  AFOQT  scores.  They  are  referred  to  as  non-line  officers.  Non-line  officer  specialties 
require  appropriate  college  degrees  and  training.  Upon  entry  into  the  Air  Force,  non-line  officers  complete  an 
abbreviated  officer  training  course. 
Since the implementation of AFOQT Form O in 1981, the development and
implementation  cycle  for  new  forms  has  been  about  seven  or  eight  years.  AFOQT  Form  S  was 
implemented  in  June  2005.  Two  lines  of  research  are  underway  during  the  current  AFOQT 
development  cycle.  The  first  is  focused  on  development  of  content  specifications  for  Form  T.  As 
part  of  this  effort,  focus  groups  are  being  conducted  with  Air  Force  officers  in  rated  and  non- 
rated  career  fields  to  identify  critical  knowledge,  skills,  abilities,  and  other  characteristics 
(KSAOs)  for  Air  Force  officer  and  technical  training  programs.  Responses  from  the  focus  groups 
will  be  used  to  develop  on-line  occupational  surveys  that  will  be  administered  to  approximately 
10,000  Air  Force  officers  to  determine  the  importance  of  the  KSAOs  to  career  success.  The 
results  will  be  used  to  guide  the  identification  of  constructs  to  supplement  existing  AFOQT 
content.  The  second  line  of  research  is  focused  on  the  evaluation  of  the  predictive  validity  of 
AFOQT  Form  S  versus  training  performance  (e.g.,  Carretta,  in  press).  To  this  end,  the  current 
study  examined  the  predictive  validity  of  the  AFOQT  composites  versus  training  performance  in 
several  non-rated  officer  specialties.  Results  will  provide  a  baseline  of  the  predictive  utility  of 
the  AFOQT  for  non-rated  specialties. 


2.0  METHOD 


2.1  Participants 

The  sample  consisted  of  10,542  USAF  officers  who  had  tested  on  the  AFOQT  Form  Q 
and  subsequently  attended  one  of  14  technical  training  courses.  The  training  courses  were 
Combat  Control  (13D1AB),  Airfield  Operations  (13M1),  Space  and  Missile  (13S1),  Space  and 
Missile  follow-on  (13S1X),  Intelligence  (14N1),  Weather  (15W1),  Aircraft  Maintenance  (21A1), 
Munitions  and  Munitions  Maintenance  -  Conventional  (21M1C),  Munitions  and  Munitions 
Maintenance  -  Non-conventional  (21M1NC),  Logistics  Readiness  (21R1),  Security  Forces 
(31P1), Communications-Information Systems (33S1), Communications Officer Engineering
(33S3A),  and  Manpower  and  Personnel  (37F1).  Sample  sizes  ranged  from  16  (Combat  Control) 
to  2,190  (Communications-Information  Systems)  with  an  average  sample  size  of  753  students. 
The criterion was the final technical training course grade, which is based on several written tests
and ranged from 70 to 100.
2.2 Measures


Participants  tested  on  AFOQT  Form  Q,  which  consisted  of  16  cognitive  subtests.  When 
AFOQT Form S was implemented in July 2005, five of the subtests from previous forms (O, P,
and Q) had been removed. AFOQT Form S consists of 11 cognitive subtests that are combined
into  five  composites  (see  Table  1).  For  the  purpose  of  this  study,  AFOQT  raw  score  composites 
were  computed  on  the  basis  of  the  Form  S  content  and  composite  specifications.  Personnel 
decisions  including  qualification  for  officer  commissioning  programs  and  aircrew  training  are 
made,  in  part,  on  the  basis  of  the  composites.  Brief  descriptions  of  the  AFOQT  subtests  grouped 
by  content  are  presented  below. 


Table 1. Composition of AFOQT Form S Aptitude Composites

                                    Composite
                                                          Academic         Navigator/
                                    Verbal  Quantitative  Aptitude  Pilot  Technical
Subtest                             (V)     (Q)           (AA)      (P)    (N/T)
Verbal Analogies (VA)               X                     X                X
Arithmetic Reasoning (AR)                   X             X         X      X
Word Knowledge (WK)                 X                     X
Math Knowledge (MK)                         X             X         X      X
Instrument Comprehension (IC)                                       X
Block Counting (BC)                                                        X
Table Reading (TR)                                                  X      X
Aviation Information (AI)                                           X
Rotated Blocks (RB)
General Science (GS)                                                       X
Hidden Figures (HF)
Self-Description Inventory (SDI+)

Note. Although RB and HF were retained in AFOQT Form S, they do not contribute to any of
the operational composites. The SDI+ is an experimental non-cognitive subtest.
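
As an illustration of how raw score composites might be assembled from subtest scores, the following minimal sketch encodes the composite membership read from Table 1. It assumes that each composite is a simple unweighted sum of its subtests, which the report does not state; the function name and the example scores are hypothetical and are not taken from the study data.

```python
# Composite membership as read from Table 1 (Form S operational composites).
COMPOSITES = {
    "V": ["VA", "WK"],
    "Q": ["AR", "MK"],
    "AA": ["VA", "AR", "WK", "MK"],
    "P": ["AR", "MK", "IC", "TR", "AI"],
    "N/T": ["VA", "AR", "MK", "BC", "TR", "GS"],
}


def raw_composites(subtest_scores):
    """Sum the raw subtest scores belonging to each composite.

    Assumption: unit weights. The report does not give the actual
    weighting or scaling used for operational composites.
    """
    return {
        name: sum(subtest_scores[code] for code in subtests)
        for name, subtests in COMPOSITES.items()
    }


# Hypothetical raw scores for one examinee (illustrative only).
example_scores = {
    "VA": 18, "AR": 20, "WK": 17, "MK": 19, "IC": 12, "BC": 14,
    "TR": 30, "AI": 9, "RB": 8, "GS": 11, "HF": 10,
}
example_composites = raw_composites(example_scores)
```

Because the composites overlap, a single subtest such as AR contributes to four of the five composite scores, while RB and HF contribute to none.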

Confirmatory factor analyses of the AFOQT Form S subtests have shown the test to measure
general intelligence and the five content-specific factors of verbal, quantitative, spatial, aviation
knowledge, and processing speed (Drasgow, Nye, Carretta, & Ree, in press). Drasgow et al. (in
press) also demonstrated the measurement equivalence of the AFOQT across gender and
racial/ethnic subgroups. These results are consistent with analyses of the previous 16-subtest
form (Carretta & Ree, 1995b, 1996). The reliabilities for the five composites in the normative
sample are: Verbal (.91), Quantitative (.92), Academic Aptitude (.94), Pilot (.94), and
Navigator/Technical (.95).

Verbal  subtests.  Verbal  Analogies  (VA)  provides  a  measure  of  the  ability  to  reason  and 
determine  relationships  between  words.  Word  Knowledge  (WK)  assesses  verbal  comprehension 
involving  the  ability  to  understand  written  language  through  the  use  of  synonyms. 

Quantitative  subtests.  Arithmetic  Reasoning  (AR)  measures  the  ability  to  understand 
arithmetic  relations  expressed  as  word  problems.  Math  Knowledge  (MK)  provides  a  measure  of 
the  ability  to  use  mathematical  terms,  formulas,  and  relations. 

Spatial  subtests.  Block  Counting  (BC)  measures  spatial  ability  through  the  analysis  of 
three-dimensional  representations  of  a  set  of  blocks.  Rotated  Blocks  (RB)  assesses  the  ability  to 
visualize  and  mentally  manipulate  objects.  Hidden  Figures  (HF)  measures  the  ability  to  see  a 
simple  figure  embedded  in  a  complex  drawing. 

Aircrew  subtests.  Instrument  Comprehension  (IC)  assesses  the  ability  to  determine  the 
attitude  of  an  aircraft  from  illustrations  of  flight  instruments.  Aviation  Information  (AI)  measures 
knowledge  of  general  aviation  terms,  concepts,  and  principles.  General  Science  (GS)  provides  a 
measure  of  knowledge  and  understanding  of  scientific  terms,  concepts,  instruments,  and 
principles. 

Perceptual speed subtests. Table Reading (TR) assesses the ability to quickly and
accurately extract information from tables.
2.3 Analyses

When  conducting  a  meta-analysis  of  validities  across  several  studies,  it  is  desirable  to 
correct  for  artifacts  such  as  sampling  error,  range  restriction,  reliability,  recording  errors,  and 
others  that  may  contribute  to  variation  in  outcomes  across  studies.  The  extent  to  which 
corrections can be made is determined by the availability of data and knowledge about the
studies.  Hunter  and  Schmidt  (2004)  noted  that  even  if  all  artifacts  have  been  identified  and  if  all 
known  artifacts  are  controlled,  variation  in  study  outcomes  due  to  data  errors  would  still  occur. 
They  further  noted  that  in  actual  meta-analyses  attenuation  and  false  variation  caused  by 
uncontrolled  and  unknown  artifacts  occur  in  addition  to  variation  caused  by  bad  data.  These 
observations  led  Schmidt  and  Hunter  (1977)  to  propose  their  “75%  rule”  which  serves  as  a 
guideline  that  if  in  any  data  set,  known  and  correctable  artifacts  account  for  75%  of  the  variance 
in  study  correlations,  the  remaining  25%  of  the  variance  is  probably  due  to  uncontrolled  artifacts 
(e.g.,  study  differences  in  test  validity,  transcription  errors,  and  typographical  errors)  and  that  no 
substantive  variance  exists.  If  variance  due  to  sampling  error  across  the  studies  accounts  for  less 
than  75%  of  the  observed  variance,  the  possibility  of  moderator  variable  effects  exists. 

Whereas previous studies of the predictiveness of AFOQT scores versus performance in
non-rated officer specialties have focused on the Academic Aptitude composite (Arth, 1986;
Finegold & Rogers, 1985; Hartke & Short, 1988), the current study examined all five
composites. Three meta-analyses were performed using observed correlations, correlations
corrected  for  range  restriction,  and  correlations  corrected  for  both  range  restriction  and 
unreliability  of  the  scores  and  criterion.  Analyses  began  by  examination  of  the  observed 
correlation  between  the  AFOQT  composites  and  technical  training  final  grade  for  each  officer 
training  course.  The  meta-analyses  of  the  observed  correlations  were  corrected  only  for  sampling 
error.  The  observed  correlations  then  were  corrected  for  range  restriction  using  the  multivariate 
method  (Lawley,  1943)  and  the  meta-analyses  were  repeated.  The  range-restriction  corrected 
correlations  were  then  corrected  for  unreliability  (Hunter  &  Schmidt,  2004)  of  the  test  scores  and 
training criterion ($r_c = r_{xy} / \sqrt{r_{xx} r_{yy}}$, where $r_{xy}$ is the correlation being corrected
and $r_{xx}$ and $r_{yy}$ are the reliabilities of the test score and the criterion) and the meta-analyses
were repeated. The reliabilities of the measures being correlated affect the correlations. The upper
theoretical limit of the correlation between any two measures is the square root of the product of their
reliabilities. This third set of correlations provides a theoretical estimate of the predictiveness of the
composites when perfectly reliable measures are available.


3.0  RESULTS 


3.1  Observed  Correlations 

The  observed  correlations  analyses  are  summarized  in  Table  2.  Ninety  percent  (63  out  of 
70)  of  the  observed  correlations  between  the  AFOQT  composites  and  average  officer  technical 
training  grades  were  statistically  significant  at  or  beyond  the  .05  level.  The  exceptions  occurred 
for two of the three smallest samples, Combat Control (n = 16) and Communications Officer
Engineering  (n  =  59).  The  weighted  mean  correlations  between  the  AFOQT  composites  and  final 
technical  training  course  grades  ranged  from  .2614  (Verbal)  to  .3265  (Academic  Aptitude).  The 
proportion  of  variance  accounted  for  by  sampling  error  was:  Verbal  (39.62%),  Quantitative 
(75.25%),  Academic  Aptitude  (71.64%),  Pilot  (61.59%),  and  Navigator/Technical  (72.29%).  Of 
the  five  composites,  only  the  Quantitative  composite  met  or  exceeded  Schmidt  and  Hunter’s 
(1977)  75%  threshold.  This  was  evidence  of  the  possible  existence  of  artifacts  affecting  the 
variability  of  the  correlations  and  that  the  true  validity  of  the  Verbal,  Academic  Aptitude,  Pilot, 
and  Navigator/Technical  composites  was  not  the  same  across  all  occupational  specialties  for  the 
training  courses  included  in  this  analysis.  Only  for  the  Quantitative  composite,  where  sampling 
variance  accounted  for  slightly  more  than  75%  of  the  observed  variance  around  the  weighted 
mean  validity,  can  it  be  concluded  that  its  validity  is  the  same  for  all  14  technical  training 
courses  and  that  observed  variance  in  the  observed  validities  is  due  to  artifacts. 
Table 2. Correlations between AFOQT Composites and Officer Technical Training
Grades: Observed

                                                                       AFOQT Composite
Air Force Specialty                                   Course   N       V         Q         AA        P         N/T
Combat Control                                        13D1AB   16      .271      .381      .404      .462*     .484*
Airfield Operations                                   13M1     251     .266**    .368**    .369**    .382**    .385**
Space & Missile                                       13S1     1638    .323**    .386**    .426**    .405**    .427**
Space & Missile - Follow-on Course                    13S1X    345     .347**    .407**    .435**    .368**    .412**
Intelligence                                          14N1     1983    .266**    .324**    .353**    .300**    .341**
Weather                                               15W1     294     .225**    .277**    .311**    .232**    .228**
Aircraft Maintenance                                  21A1     1430    .243**    .227**    .277**    .274**    .274**
Munitions & Munitions Maintenance - Conventional      21M1C    42      .318*     .377**    .418**    .465**    .398**
Munitions & Munitions Maintenance - Non-Conventional  21M1NC   246     .118*     .238**    .211**    .277**    .288**
Logistics Readiness                                   21R1     1130    .257**    .243**    .294**    .253**    .284**
Security Forces                                       31P1     599     .295**    .179**    .285**    .215**    .257**
Communications-Information Systems                    33S1     2190    .235**    .250**    .286**    .259**    .282**
Communications Officer Engineering                    33S3A    59      .035      .059      .057      .231*     .071
Manpower & Personnel                                  37F1     319     .216**    .338**    .331**    .297**    .339**
Weighted Mean (All AFSs)                                       10542   .2614     .2878     .3265     .2965     .3199
95% CI (upper)                                                 10542   .2793     .3053     .3436     .3139     .3371
95% CI (lower)                                                 10542   .2437     .2703     .3095     .2792     .3028

*p < .05; **p < .01
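
The bare-bones computation behind the weighted means and the 75% rule can be sketched as follows, using the Verbal column of Table 2. This is a minimal sketch and not the study's actual procedure: the variance expressions follow the common Hunter and Schmidt (2004) bare-bones formulation, and the standard error used for the confidence interval is an assumption, since the report does not state its formulas. The percentage it yields therefore need not equal the 39.62% reported for the Verbal composite. The function and variable names are hypothetical.

```python
import math

# Verbal (V) column of Table 2: (sample size, observed correlation) per course.
VERBAL_OBSERVED = [
    (16, .271), (251, .266), (1638, .323), (345, .347), (1983, .266),
    (294, .225), (1430, .243), (42, .318), (246, .118), (1130, .257),
    (599, .295), (2190, .235), (59, .035), (319, .216),
]


def bare_bones(studies):
    """Bare-bones meta-analysis: account for sampling error only."""
    k = len(studies)
    total_n = sum(n for n, _ in studies)
    # Sample-size-weighted mean correlation.
    mean_r = sum(n * r for n, r in studies) / total_n
    # Sample-size-weighted observed variance of the correlations.
    var_obs = sum(n * (r - mean_r) ** 2 for n, r in studies) / total_n
    # Expected sampling-error variance, based on the average sample size.
    var_err = (1 - mean_r ** 2) ** 2 / (total_n / k - 1)
    pct_sampling = 100 * var_err / var_obs
    # Assumed standard error of the mean correlation (homogeneous case).
    se_mean = (1 - mean_r ** 2) / math.sqrt(total_n - k)
    ci = (mean_r - 1.96 * se_mean, mean_r + 1.96 * se_mean)
    return mean_r, pct_sampling, ci


mean_r, pct_sampling, ci = bare_bones(VERBAL_OBSERVED)
# Schmidt and Hunter's (1977) 75% rule applied to sampling error alone.
generalizes = pct_sampling >= 75
print(f"mean r = {mean_r:.4f}, sampling error = {pct_sampling:.1f}% of variance, "
      f"95% CI = ({ci[0]:.4f}, {ci[1]:.4f}), generalizes = {generalizes}")
```

With the rounded correlations from Table 2, the weighted mean and the assumed confidence limits agree with the tabled Verbal values to about three decimal places, and the Verbal composite falls short of the 75% threshold, as in the report.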
3.2 Range-Restriction Corrected Correlations

The observed correlations were corrected for range restriction (Lawley, 1943) to provide a
better  statistical  estimate  of  the  true  relationship  between  the  test  scores  and  training 
performance.  The  analyses  of  the  range-restriction  corrected  correlations  are  summarized  in 
Table  3.  Most  correlations  increased  after  correction  for  range  restriction.  It  should  be  noted  that 
in  a  few  instances  (5  of  70)  the  correlations  decreased  in  magnitude  after  correction  for  range 
restriction  (Combat  Control,  Communications  Officer  Engineering).  This  is  unusual,  but  can 
occur  when  the  correction  leads  to  a  reduction  in  variance  (Levin,  1972).  As  expected,  after 
correction  for  range  restriction,  the  weighted  mean  correlations  between  the  AFOQT  composites 
and  final  technical  training  course  grades  increased  for  all  five  composites.  The  corrected 
weighted  mean  correlations  ranged  from  .3222  (Verbal)  to  .3878  (Academic  Aptitude).  The 
proportion  of  variance  accounted  for  by  sampling  error  also  increased  for  all  five  composites 
after  correction  for  range  restriction.  The  values  were:  Verbal  (62.91%),  Quantitative  (79.27%), 
Academic  Aptitude  (78.00%),  Pilot  (73.97%),  and  Navigator/Technical  (75.61%).  After 
correction, the Quantitative, Academic Aptitude, and Navigator/Technical composites met or exceeded
Schmidt and Hunter's (1977) 75% threshold, and the Pilot composite was only slightly below it. Thus,
with  the  exception  of  the  Verbal  composite,  the  predictive  validity  of  the  AFOQT  composites 
was  the  same  for  all  14  technical  training  courses  and  the  observed  variance  in  the  range- 
restriction  corrected  validities  can  be  attributed  to  artifacts. 
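
The direction of a range-restriction correction can be illustrated with the univariate special case (a single, directly selected predictor). This is a simplification chosen for brevity and is not the multivariate Lawley (1943) procedure used in the analyses above; the function name and the input values are hypothetical.

```python
import math


def correct_range_restriction(r, u):
    """Univariate correction of a correlation for direct range restriction.

    r: correlation observed in the selected (restricted) sample
    u: ratio of the unrestricted to the restricted predictor standard deviation
    """
    return r * u / math.sqrt(1 + r ** 2 * (u ** 2 - 1))


# Hypothetical values, not taken from the report.
widened = correct_range_restriction(0.30, 1.5)    # reference SD larger than sample SD
unchanged = correct_range_restriction(0.30, 1.0)  # equal SDs leave r as observed
shrunk = correct_range_restriction(0.30, 0.8)     # reference SD smaller than sample SD
```

When the sample is less variable than the reference population (u > 1), the corrected correlation is larger than the observed one. When the sample happens to be more variable than the reference population (u < 1), the correction reduces the correlation, which is the situation described above for the few coefficients that decreased.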
Table 3. Correlations between AFOQT Composites and Officer Technical Training
Grades: Range-Restriction Corrected

                                                                       AFOQT Composite
Air Force Specialty                                   Course   N       V         Q         AA        P         N/T
Combat Control                                        13D1AB   16      .512      .146      .372      .134      .197
Airfield Operations                                   13M1     251     .286      .389      .393      .407      .410
Space & Missile                                       13S1     1638    .398      .459      .496      .469      .494
Space & Missile - Follow-On Course                    13S1X    345     .375      .435      .468      .402      .446
Intelligence                                          14N1     1983    .338      .388      .420      .365      .407
Weather                                               15W1     294     .336      .450      .456      .404      .407
Aircraft Maintenance                                  21A1     1430    .310      .297      .349      .336      .341
Munitions & Munitions Maintenance - Conventional      21M1C    42      .444      .442      .511      .496      .442
Munitions & Munitions Maintenance - Non-Conventional  21M1NC   246     .158      .301      .269      .308      .330
Logistics Readiness                                   21R1     1130    .307      .297      .348      .296      .332
Security Forces                                       31P1     599     .355      .255      .347      .282      .324
Communications-Information Systems                    33S1     2190    .282      .293      .331      .295      .324
Communications Officer Engineering                    33S3A    59      -.005     .160      .095      .333      .188
Manpower & Personnel                                  37F1     319     .278      .398      .393      .366      .404
Weighted Mean (All AFSs)                                       10542   .3222     .3499     .3878     .3525     .3796
95% CI (upper)                                                 10542   .3393     .3666     .4040     .3692     .3959
95% CI (lower)                                                 10542   .3051     .3332     .3716     .3358     .3633

Note. No tests for statistical significance were performed for the corrected correlations.
3.3 Range-Restriction and Reliability Corrected Correlations


As  previously  noted,  the  upper  theoretical  limit  of  the  correlation  between  any  two 
measures  is  the  square  root  of  the  product  of  their  reliabilities.  Correcting  the  correlations 
between  the  AFOQT  composites  and  technical  training  scores  for  unreliability  (attenuation) 
provides  a  theoretical  estimate  of  the  predictiveness  of  the  composites  if  perfectly  reliable 
measures  were  available.  Reliability  estimates  for  the  AFOQT  composites  were  from  the  Form  S 
normative sample and were based on the Wherry and Gaylord (1943) procedure. The estimates
were:  Verbal  (.91),  Quantitative  (.92),  Academic  Aptitude  (.94),  Pilot  (.94),  and 
Navigator/Technical  (.95).  The  reliability  of  the  final  technical  training  grades  was  estimated  to 
be  .80. 


The  analyses  of  the  correlations  after  correction  for  both  range-restriction  and 
unreliability of the scores and criterion are summarized in Table 4. Correcting the correlations
for  both  range  restriction  and  unreliability  increased  their  magnitudes  above  those  corrected  only 
for  range  restriction.  The  corrected  weighted  mean  correlations  between  the  AFOQT  composites 
and  final  technical  training  course  grades  ranged  from  .3776  (Verbal)  to  .4476  (Academic 
Aptitude).  The  proportion  of  variance  accounted  for  by  sampling  error  did  not  change  from  the 
previous  analyses  where  the  correlations  were  corrected  for  range  restriction  only.  This  is 
because  the  same  reliability  estimates  were  used  for  all  samples.  Thus,  when  the  correction  for 
attenuation  was  applied  the  validities  for  each  composite  were  corrected  by  the  same  proportion. 
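
The correction for attenuation described above can be sketched as follows. The reliabilities and weighted means are those reported in the text and in Tables 3 and 4; the function and variable names are hypothetical, and the sketch applies the correction to the weighted means rather than to each course-level correlation.

```python
import math

CRITERION_RELIABILITY = 0.80  # estimated reliability of final training grades

# composite: (reliability, Table 3 weighted mean, Table 4 weighted mean)
COMPOSITE_STATS = {
    "Verbal": (0.91, 0.3222, 0.3776),
    "Quantitative": (0.92, 0.3499, 0.4081),
    "Academic Aptitude": (0.94, 0.3878, 0.4476),
    "Pilot": (0.94, 0.3525, 0.4071),
    "Navigator/Technical": (0.95, 0.3796, 0.4359),
}


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Divide a correlation by the square root of the product of the reliabilities."""
    return r_xy / math.sqrt(r_xx * r_yy)


corrected = {
    name: correct_for_attenuation(r_xy, r_xx, CRITERION_RELIABILITY)
    for name, (r_xx, r_xy, _reported) in COMPOSITE_STATS.items()
}

for name, (_r_xx, _r_xy, reported) in COMPOSITE_STATS.items():
    print(f"{name}: computed {corrected[name]:.4f}, reported {reported:.4f}")
```

The computed values agree with the Table 4 weighted means to within about .001. Because every correlation for a given composite is divided by the same constant, the ratio of sampling-error variance to observed variance is unaffected, which is why the percentages did not change.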
Table 4. Correlations between AFOQT Composites and Officer Technical Training
Grades: Range-Restriction and Reliability Corrected

                                                                       AFOQT Composite
Air Force Specialty                                   Course   N       V         Q         AA        P         N/T
Combat Control                                        13D1AB   16      .600      .170      .428      .154      .225
Airfield Operations                                   13M1     251     .335      .453      .453      .469      .470
Space & Missile                                       13S1     1638    .466      .535      .571      .540      .566
Space & Missile - Follow-On Course                    13S1X    345     .466      .507      .539      .463      .511
Intelligence                                          14N1     1983    .396      .452      .484      .420      .466
Weather                                               15W1     294     .393      .524      .525      .465      .466
Aircraft Maintenance                                  21A1     1430    .363      .346      .402      .387      .391
Munitions & Munitions Maintenance - Conventional      21M1C    42      .520      .515      .589      .571      .507
Munitions & Munitions Maintenance - Non-Conventional  21M1NC   246     .185      .350      .310      .355      .378
Logistics Readiness                                   21R1     1130    .359      .346      .401      .341      .380
Security Forces                                       31P1     599     .416      .297      .400      .325      .371
Communications-Information Systems                    33S1     2190    .330      .341      .381      .340      .371
Communications Officer Engineering                    33S3A    59      -.005     .186      .109      .384      .215
Manpower & Personnel                                  37F1     319     .325      .463      .453      .422      .463
Weighted Mean (All AFSs)                                       10542   .3776     .4081     .4476     .4071     .4359
95% CI (upper)                                                 10542   .3977     .4277     .4663     .4264     .4546
95% CI (lower)                                                 10542   .3575     .3886     .4289     .3878     .4171

Note. No tests for statistical significance were performed for the corrected correlations.
4.0 DISCUSSION


Results  of  the  meta-analyses  of  the  observed  correlations  indicated  that  the  predictive 
validity  for  four  of  the  five  AFOQT  composites  (Verbal,  Academic  Aptitude,  Pilot,  and 
Navigator/Technical) was not the same across the non-rated officer training specialties in the
current  study.  Validity  generalization  was  observed  for  only  the  Quantitative  composite.  The 
lack  of  generalizability  for  the  observed  correlations  is  consistent  with  results  of  a  bare  bones 
meta-analysis  that  examined  the  generalizability  of  the  validity  of  the  AFOQT  Academic 
Aptitude  composite  for  47  validity  coefficients  involving  officer  technical  training  grades 
(Hartke  &  Short,  1988).  Hartke  and  Short  observed  that  though  the  validity  of  the  Academic 
Aptitude  composite  varied  across  the  officer  training  specialties,  it  demonstrated  usefulness  for 
nearly  all  of  them.  A  similar  trend  was  observed  in  the  current  analyses.  However,  the  current 
analyses  extended  those  reported  by  Hartke  and  Short,  in  that  the  predictive  validity  of  all  five 
composites  was  examined,  not  just  Academic  Aptitude.  Though  the  validity  of  the  composites 
varied  (i.e.,  was  not  generalizable)  across  the  14  officer  training  specialties,  their  usefulness  was 
demonstrated for non-rated officer technical training. Ninety percent (63 out of 70) of the
correlations  between  the  composites  and  the  training  criterion  were  statistically  significant. 

Hartke  and  Short  (1988)  were  unable  to  correct  their  data  for  the  statistical  artifacts  of 
range  restriction  or  unreliability  due  to  a  lack  of  information  about  the  studies  in  their  analyses. 
The  current  analyses  corrected  for  both  range  restriction  and  unreliability  of  the  scores.  After 
correction  for  range  restriction,  four  of  the  five  composites  demonstrated  validity  generalization 
across  the  training  specialties.  This  means  that  the  true  predictive  validity  of  the  AFOQT 
composites  (with  the  exception  of  Verbal),  was  consistent  across  the  officer  training  specialties. 
The  mean  validity  coefficients  for  the  Quantitative  (.3499),  Academic  Aptitude  (.3878),  Pilot 
(.3525),  and  Navigator/Technical  (.3796)  composites  are  the  best  estimates  of  the  average 
validity  across  all  officer  specialties. 

It  is  interesting  to  note  that  though  the  composites  differ  in  composition  (see  Table  1), 
there was little difference in their mean predictive validity. Even with the Verbal composite (mean
weighted validity = .3222) included despite its lack of generalizability, the range in mean
weighted validities was only .0656 (.3878 - .3222 = .0656).
Additional efforts to examine the generalizability of the validity of AFOQT composites
should  expand  to  include  a  broader  range  of  occupational  specialties.  Expanding  the  breadth  of 
training  specialties  would  allow  the  potential  moderating  effect  of  occupational  similarity  to  be 
examined.  Occupational  subgroups  could  be  defined  on  the  basis  of  task  characteristics  of  the 
training  specialties.  Sorting  training  specialties  into  homogeneous  subgroups  could  strengthen 
validity  generalization.  Presumably,  occupations  with  similar  task  characteristics  and  training 
also  would  be  more  similar  in  their  aptitude  requirements.  Unfortunately,  the  current  analyses 
had  too  few  training  specialties  to  allow  for  division  into  homogeneous  subgroups. 

Although  the  meta-analytic  results  generally  supported  the  predictiveness  of  the  AFOQT 
composites  across  several  officer  training  specialties,  results  indicate  there  is  reliable  variance  in 
training  performance  not  being  predicted  by  the  AFOQT  composites.  Even  after  correction  for 
range  restriction  and  unreliability,  the  validities  of  the  composites  ranged  from  .3776  (Verbal)  to 
.4476  (Academic  Aptitude).  One  way  to  improve  predictive  validity  would  be  to  identify  content 
areas  not  currently  covered  by  the  AFOQT  that  could  account  for  additional  reliable  variance  in 
training  performance.  As  previously  noted,  efforts  have  begun  to  identify  critical  knowledge, 
skills,  abilities,  and  other  characteristics  for  Air  Force  officer  and  technical  training  programs. 
The  results  will  be  used  to  guide  the  identification  of  constructs  to  supplement  existing  AFOQT 
content. 

Finally,  as  noted  earlier,  currently  there  are  no  AFOQT  requirements  for  non-rated  officer 
training  qualification  beyond  qualifying  for  an  officer  commissioning  program.  The  current 
results  suggest  that  further  studies  should  be  conducted  to  examine  the  utility  of  minimum 
AFOQT qualifying scores for non-rated officer training specialties, including their effect on training
performance  and  subgroup  qualification  rates  (e.g.,  adverse  impact). 